Author Details


Senthamarai Kannan, S.

Data Extraction in Web Databases by Combining Tag and Value Similarity

Abstract Views :167 | PDF Views:2

Authors

J. Deepika ¹, S. Senthamarai Kannan ¹

Affiliations
1 Sethu Institute of Technology, IN

Source

Data Mining and Knowledge Engineering, Vol 5, No 4 (2013), Pagination: 156-161

Abstract

In real-time applications, identifying records that represent the same real-world entity is a major challenge. Detecting and removing duplicate records that refer to the same entity within a dataset is an important data preprocessing task. CTVS, a novel data extraction and alignment method that combines tag and value similarity, is enhanced with the unsupervised duplicate detection (UDD) algorithm to eliminate duplicate records in web databases. CTVS automatically extracts data from query result pages by first identifying and segmenting the query result records (QRRs) in those pages and then aligning the segmented QRRs into a table in which data values of the same attribute are placed in the same column. Specifically, new techniques are proposed to handle QRRs that are not contiguous, which may result from auxiliary information such as a comment, recommendation, or advertisement, and to handle any nested structure within the QRRs. A new record alignment algorithm is also designed that aligns the attributes in a record, first pairwise and then holistically, by combining tag and data value similarity information.

Keywords

Automatic Wrapper Generation, Data Extraction, Data Record Alignment, Duplicate Detection.
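
The pairwise alignment step described in the abstract can be illustrated with a small sketch. This is not the CTVS algorithm itself: the tag-path format, the `shape` heuristic for value similarity, the equal weighting, and the 0.6 threshold are all assumptions chosen for illustration, and the greedy matching stands in for whatever pairwise procedure the paper uses.

```python
from difflib import SequenceMatcher


def shape(value):
    """Collapse a value to a coarse pattern: letter runs -> 'a', digit runs -> '9'."""
    out = []
    for ch in value:
        c = "a" if ch.isalpha() else "9" if ch.isdigit() else ch
        if not out or out[-1] != c:
            out.append(c)
    return "".join(out)


def tag_similarity(path_a, path_b):
    """Fraction of leading tag-path components shared by two fields (assumed measure)."""
    a, b = path_a.split("/"), path_b.split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b))


def value_similarity(val_a, val_b):
    """Compare the coarse shapes of two data values (assumed measure)."""
    return SequenceMatcher(None, shape(val_a), shape(val_b)).ratio()


def combined_similarity(field_a, field_b, weight=0.5):
    """Weighted mix of tag and value similarity; the weight is hypothetical."""
    (tag_a, val_a), (tag_b, val_b) = field_a, field_b
    return (weight * tag_similarity(tag_a, tag_b)
            + (1 - weight) * value_similarity(val_a, val_b))


def align_pair(rec_a, rec_b, threshold=0.6):
    """Greedily match each field of rec_a to its best unused field of rec_b."""
    pairs, used = [], set()
    for i, field_a in enumerate(rec_a):
        best_j, best_score = None, threshold
        for j, field_b in enumerate(rec_b):
            if j in used:
                continue
            score = combined_similarity(field_a, field_b)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j))
    return pairs


# Two hypothetical query result records as (tag path, value) fields.
rec_1 = [("td/a", "Data Mining Basics"), ("td/span", "$12.99")]
rec_2 = [("td/a", "Web Databases"), ("td/span", "$8.50"), ("td/i", "Free shipping")]
alignment = align_pair(rec_1, rec_2)
```

Here the title and price fields of the two records are paired, while the extra "Free shipping" field stays unmatched, which is the kind of auxiliary content the abstract says must be tolerated.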


Improvisation of Clustering by Attribute Reduction Using Bayesian Theorem

Abstract Views :178 | PDF Views:2

Authors

S. Senthamarai Kannan ¹, N. Ramaraj ², S. Baskar ³

Affiliations
1 Department of IT, Thiagarajar College of Engineering, Madurai, IN
2 G.K.M College of Engineering, Chennai, IN
3 EEE Department, Thiagarajar College of Engineering, Madurai, IN

Source

Data Mining and Knowledge Engineering, Vol 1, No 4 (2009), Pagination: 162-166

Abstract

Data reduction, which aims to reduce the dimensionality of large-scale data without losing useful information, is an important topic in knowledge discovery, data clustering, and classification. This paper introduces a novel concept of dependency-based attribute reduction using Bayes' theorem. Bayesian theory is of great interest in data reduction. Attribute reduction is a data mining approach for detecting and characterizing combinations of attributes, or independent variables, that interact to influence a dependent or class variable. The basis of this attribute reduction is a method that converts two or more attributes into a single attribute and calculates the probabilities of their values in deciding the value of the class attribute. The dependent attributes are thereby identified and removed from the original dataset. The end goal is to improve classification accuracy, so that the class variable is predicted better than with the original data and its initial attribute set, while also reducing computation time.

Keywords

Attributes Reduction, Data Classification, Bayesian Theory, Clustering, Simple K-Means, Cobweb, EM.
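
The idea of finding and removing dependent attributes can be sketched as follows. The abstract does not give the exact dependency measure, so the one used here is an assumption: attribute `b` is treated as dependent on attribute `a` when the most probable value of `b` given each value of `a` (estimated from counts) covers at least a chosen fraction of the rows. The threshold, attribute names, and toy rows are hypothetical.

```python
from collections import Counter, defaultdict


def dependency(rows, a, b):
    """Fraction of rows in which b takes its most probable value given a."""
    by_a = defaultdict(Counter)
    for row in rows:
        by_a[row[a]][row[b]] += 1
    predictable = sum(max(counts.values()) for counts in by_a.values())
    return predictable / len(rows)


def reduce_attributes(rows, attributes, threshold=0.9):
    """Keep an attribute only if no already-kept attribute predicts it."""
    kept = []
    for attr in attributes:
        if any(dependency(rows, k, attr) >= threshold for k in kept):
            continue  # attr is (almost) determined by a kept attribute
        kept.append(attr)
    return kept


# Toy data: "season" is fully determined by "temp"; "wind" is not.
rows = [
    {"temp": "hot", "season": "summer", "wind": "low"},
    {"temp": "hot", "season": "summer", "wind": "high"},
    {"temp": "cold", "season": "winter", "wind": "low"},
    {"temp": "cold", "season": "winter", "wind": "high"},
]
selected = reduce_attributes(rows, ["temp", "season", "wind"])
```

In this toy case `season` is dropped because `temp` predicts it in every row, while `wind` is kept. The result depends on the order in which attributes are visited, which is a simplification of this sketch rather than a property claimed by the paper.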


Affinity Propagation Based Algorithm for Optimal K-Means Clustering

Abstract Views :202 | PDF Views:4

Authors

S. Senthamarai Kannan ¹, N. Ramaraj ², S. Baskar ³

Affiliations
1 Department of I.T, Thiagarajar College of Engineering, Madurai, IN
2 G.K.M College of Engineering, Chennai, Tamil Nadu, IN
3 E.E.E Department, Thiagarajar College of Engineering, Madurai, IN

Source

Artificial Intelligent Systems and Machine Learning, Vol 1, No 4 (2009), Pagination: 147-151

Abstract

K-means clustering is widely used because of its fast convergence, but it is sensitive to initial conditions. A further limitation of the k-means algorithm is that the user has to specify the number of clusters (K). Some methods exist for choosing the number of clusters, but they perform poorly in some cases. This paper therefore proposes a method built on affinity propagation (AP) that resolves these problems. By making use of the convergence property of k-means and the good performance of affinity propagation, we present a new clustering strategy that can produce much lower squared error than AP and standard k-means. The efficiency and effectiveness of our method are demonstrated through extensive comparisons with other methods on high-dimensional UCI datasets.

Keywords

K-Means, Affinity Propagation, Centroid Initialization, Clustering Optimization.
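
One plausible reading of this strategy is to let affinity propagation choose the number of clusters and the initial centroids, then refine them with k-means. The sketch below follows that reading using scikit-learn and NumPy; it is an assumption about the general approach, not the authors' implementation. The toy data, the damping value, and the helper names `make_toy_data` and `ap_seeded_kmeans` are hypothetical.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation, KMeans


def make_toy_data(seed=0):
    """Three well-separated 2-D blobs (hypothetical data for illustration)."""
    rng = np.random.default_rng(seed)
    blobs = [rng.normal(loc=c, scale=0.3, size=(20, 2))
             for c in [(0, 0), (5, 5), (0, 5)]]
    return np.vstack(blobs)


def ap_seeded_kmeans(X, random_state=0):
    """Run affinity propagation, then refine its exemplars with k-means."""
    ap = AffinityPropagation(damping=0.9, max_iter=1000,
                             random_state=random_state).fit(X)
    centers = ap.cluster_centers_
    if len(centers) == 0:
        # scikit-learn returns no centers when AP fails to converge.
        raise ValueError("affinity propagation did not converge")
    # Squared error of the AP solution: each point to its own exemplar.
    ap_sse = float(np.sum((X - centers[ap.labels_]) ** 2))
    # K is taken from AP, and the exemplars seed the centroids.
    km = KMeans(n_clusters=len(centers), init=centers, n_init=1).fit(X)
    return len(centers), ap_sse, float(km.inertia_)


X = make_toy_data()
k, ap_sse, km_sse = ap_seeded_kmeans(X)
```

Because k-means starts from the AP exemplars and each of its iterations cannot increase the squared error, `km_sse` should not exceed `ap_sse`, which is consistent with the lower squared error the abstract reports.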

Informatics Publishing Limited


Syzygies of Some GIT Quotients
